Housing_Pipeline.ipynb
- EXPLORATORY DATA ANALYSIS AND FEATURE ENGINEERING
- Cleaning and converting the Amount column to numerical data
- Cleaning the Carpet area
- Cleaning the Bathroom column
- Cleaning the Balcony column
- Cleaning the Super Area column
- Dropping unnecessary columns
- THE NUMERICAL DATA
- CATEGORICAL DATA
- OUTLIER REMOVAL USING Z-SCORE, IQR AND BOXPLOT
- Describe data
- CHECKING THE FREQUENCIES OF AMOUNT OF DIFFERENT FURNISHING FEATURES
- THE NUMBER OF BALCONY SHOULD NOT BE MORE THAN TWICE THE NUMBER OF BATHROOMS
- Using z-score for outlier removal
- Using IQR for outlier removal, for Price and Amount
- Using a boxplot to clean the Carpet_Area_sqft and Super_Area_sqft columns
- VISUALISATION ANALYSIS OF THE DATA
- USING HISTOGRAM TO CHECK FOR FREQUENCIES
- Using scatter plot to check for relationship between columns
- USING HEATMAP TO CHECK THE CORRELATION OF THE NUMERICAL COLUMNS OF THE DATA
- Plot relational pair plot to check relationship
- USING PREPROCESSING TECHNIQUES TO HANDLE NULL, TEXT, CATEGORICAL AND NUMERICAL VALUES
- Saving my cleaned data
- PREPROCESSING, PIPELINE AND MODEL BUILDING
- DIVIDE THE DATA INTO TRAIN (70%), VALIDATION (15%) AND TEST (15%) SETS
- Creating the pipeline
- Creating ColumnTransformer
- Model Hyperparameter tuning and evaluation using metrics
- Make_pipeline to join or link the transformer and voting
- LINKING THE TRANSFORMER AND VOTING PIPES USING make_pipeline
- MODEL EVALUATION
- SAVING THE MODEL USING JOBLIB LIBRARY
[5]:
(187531, 21)
[6]:
| | Index | Title | Description | Amount(in rupees) | Price (in rupees) | location | Carpet Area | Status | Floor | Transaction | ... | facing | overlooking | Society | Bathroom | Balcony | Car Parking | Ownership | Super Area | Dimensions | Plot Area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 43350 | 43350 | 2 BHK Ready to Occupy Flat for sale Sembakkam | This beautiful 2 BHK Flat is available for sal... | 54.5 Lac | 5800.0 | chennai | 800 sqft | Ready to Move | 2 out of 3 | New Property | ... | North | Main Road | NaN | 2 | 1 | 1 Covered | Freehold | NaN | NaN | NaN |
| 11814 | 11814 | 5 BHK Ready to Occupy Flat for sale in Adani S... | This gorgeous 5 BHK Flat is available for sale... | 3.65 Cr | NaN | ahmedabad | 4734 sqft | Ready to Move | 16 out of 16 | Resale | ... | East | NaN | Adani Shantigram Waterlily | 5 | 1 | 2 Covered, | Freehold | NaN | NaN | NaN |
2 rows × 21 columns
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 187531 entries, 0 to 187530
Data columns (total 21 columns):
#   Column             Non-Null Count   Dtype
--- ------             --------------   -----
0   Index              187531 non-null  int64
1   Title              187531 non-null  object
2   Description        184508 non-null  object
3   Amount(in rupees)  187531 non-null  object
4   Price (in rupees)  169866 non-null  float64
5   location           187531 non-null  object
6   Carpet Area        106858 non-null  object
7   Status             186916 non-null  object
8   Floor              180454 non-null  object
9   Transaction        187448 non-null  object
10  Furnishing         184634 non-null  object
11  facing             117298 non-null  object
12  overlooking        106095 non-null  object
13  Society            77853 non-null   object
14  Bathroom           186703 non-null  object
15  Balcony            138596 non-null  object
16  Car Parking        84174 non-null   object
17  Ownership          122014 non-null  object
18  Super Area         79846 non-null   object
19  Dimensions         0 non-null       float64
20  Plot Area          0 non-null       float64
dtypes: float64(3), int64(1), object(17)
memory usage: 30.0+ MB
[9]:
array(['42 Lac ', '98 Lac ', '1.40 Cr ', ..., '1.5 Lac ', '24.4 Lac ',
'9.90 Cr '], dtype=object)
[12]:
array(['42 Lac ', '98 Lac ', '1.40 *100 ', ..., '1.5 Lac ', '24.4 Lac ',
'9.90 *100 '], dtype=object)
[14]:
array(['42 ', '98 ', '1.40 *100 ', ..., '1.5 ', '24.4 ', '9.90 *100 '],
dtype=object)
[18]:
array([ 42. , 98. , 140. , ..., 1.5, 24.4, 990. ])
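The outputs above show the Amount strings going from `'1.40 Cr '` to `'1.40 *100 '` to `140.`, so amounts end up in lakhs with one crore counted as 100 lakh. The following is a minimal sketch of the same conversion, not the notebook's own code: it uses a unit lookup instead of rewriting and evaluating the string, and the helper name `amount_to_lac` is hypothetical.

```python
import pandas as pd

def amount_to_lac(text):
    """Convert strings such as '42 Lac ' or '1.40 Cr ' to a float in lakhs.

    Hypothetical helper: anything that is not '<number> <unit>' with a
    known unit becomes NaN instead of raising.
    """
    parts = str(text).split()
    if len(parts) != 2:
        return float("nan")
    number, unit = parts
    factor = {"lac": 1.0, "cr": 100.0}.get(unit.lower())  # 1 crore = 100 lakh
    if factor is None:
        return float("nan")
    try:
        return float(number) * factor
    except ValueError:
        return float("nan")

# Sample values copied from the array output above
amounts = pd.Series(["42 Lac ", "98 Lac ", "1.40 Cr ", "1.5 Lac ", "24.4 Lac ", "9.90 Cr "])
amount_lac = amounts.map(amount_to_lac)
print(amount_lac.tolist())
```

A lookup table avoids evaluating text from the dataset as code, which is the main reason to prefer it over an `eval`-style approach.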
[21]:
array([ 6000., 13799., 17500., ..., 2873., 2663., 2508.])
[23]:
array([0.06 , 0.13799, 0.175 , ..., 0.02873, 0.02663, 0.02508])
[26]:
array(['500 sqft', '473 sqft', '779 sqft', ..., '1634 sqft', '164 sqyrd',
'136 sqft'], dtype=object)
[27]:
| | Index | Title | Description | location | Carpet Area | Status | Floor | Transaction | Furnishing | facing | ... | Society | Bathroom | Balcony | Car Parking | Ownership | Super Area | Dimensions | Plot Area | Amount(Lac) | Price (Lac) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 187488 | 187488 | 4 BHK Ready to Occupy Flat for sale in Motia A... | Creatively planned and constructed is a 4 BHK ... | zirakpur | 164 sqyrd | Ready to Move | 2 out of 3 | New Property | Semi-Furnished | East | ... | Motia Aero Greens | 3 | 2 | 1 Covered | Freehold | NaN | NaN | NaN | 72.0 | 0.04878 |
1 rows × 21 columns
[31]:
0
[36]:
187488    1476.0
Name: Carpet_Area_sqft, dtype: float64
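Row 187488 goes from `164 sqyrd` to `1476.0` in `Carpet_Area_sqft`, which matches 164 × 9, since one square yard is nine square feet. A minimal sketch of such a unit conversion follows. Only the two units visible in the outputs above are handled; the helper name `area_to_sqft` and the choice to return NaN for any other unit are assumptions.

```python
import pandas as pd

# Only the units visible in the notebook output; other units are not guessed.
SQFT_PER_UNIT = {"sqft": 1.0, "sqyrd": 9.0}  # 1 square yard = 9 square feet

def area_to_sqft(text):
    """Convert '164 sqyrd' style strings to square feet (hypothetical helper)."""
    if pd.isna(text):
        return float("nan")
    parts = str(text).replace(",", "").split()
    if len(parts) != 2 or parts[1] not in SQFT_PER_UNIT:
        return float("nan")  # unknown unit or unexpected layout
    try:
        return float(parts[0]) * SQFT_PER_UNIT[parts[1]]
    except ValueError:
        return float("nan")

carpet = pd.Series(["500 sqft", "473 sqft", "164 sqyrd", None])
carpet_sqft = carpet.map(area_to_sqft)
print(carpet_sqft.tolist())
```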
[39]:
array(['1', '2', '3', '4', '6', nan, '5', '10', '9', '8', '> 10', '7'],
dtype=object)
[40]:
0.4415270008691896
[41]:
Bathroom  Price (Lac)
1         0.06000        858
          0.10000        847
          0.04889        834
          0.04074        826
          0.18000        824
          0.03676        650
          0.14000        650
          0.02976        649
          0.03793        648
          0.04792        647
Name: count, dtype: int64
[43]:
array(['1', '2', '3', '4', '6', '5', '10', '9', '8', '> 10', '7'],
dtype=object)
[45]:
array([ 1, 2, 3, 4, 6, 5, 10, 9, 8, 7])
[48]:
array(['2', nan, '1', '3', '4', '> 10', '6', '5', '7', '10', '8', '9'],
dtype=object)
[50]:
array(['2', '1', '3', '4', '> 10', '6', '5', '7', '10', '8', '9'],
dtype=object)
[52]:
array([ 2, 1, 3, 4, 10, 6, 5, 7, 8, 9])
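In both the Bathroom and Balcony outputs, the label `'> 10'` disappears and the integer arrays top out at 10, which suggests that `'> 10'` was capped at 10 before the cast to integers. A minimal sketch of that step is shown below. The helper name `clean_count` is hypothetical, and missing values are simply kept as `<NA>` here because the outputs do not show how the notebook filled them.

```python
import pandas as pd

def clean_count(column):
    """Turn string counts such as '3' or '> 10' into integers.

    '> 10' is capped at 10; missing values stay missing (nullable Int64).
    """
    capped = column.replace({"> 10": "10"})
    return pd.to_numeric(capped, errors="coerce").astype("Int64")

bathroom = pd.Series(["1", "2", "> 10", None, "7"])
bathroom_int = clean_count(bathroom)
print(bathroom_int.tolist())
```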
[55]:
array([nan, '680 sqft', '575 sqft', ..., '2066 sqft', '406 sqft',
'2332 sqft'], dtype=object)
[63]:
1260.0
[64]:
array([ 680., 575., 600., ..., 2066., 406., 2332.])
[66]:
Index(['Index', 'Title', 'Description', 'location', 'Status', 'Floor',
'Transaction', 'Furnishing', 'facing', 'overlooking', 'Society',
'Bathroom', 'Balcony', 'Car Parking', 'Ownership', 'Dimensions',
'Plot Area', 'Amount(Lac)', 'Price (Lac)', 'Carpet_Area_sqft',
'Super_Area_sqft'],
dtype='object')
[67]:
Index               187531
Title                32446
Description          65634
location                81
Status                   1
Floor                  947
Transaction              4
Furnishing               3
facing                   8
overlooking             19
Society              10376
Bathroom                10
Balcony                 10
Car Parking            229
Ownership                4
Dimensions               0
Plot Area                0
Amount(Lac)           1559
Price (Lac)          10958
Carpet_Area_sqft      2425
Super_Area_sqft       2619
dtype: int64
[68]:
Index               100.000000
Title                17.301673
Description          34.999013
location              0.043193
Status                0.000533
Floor                 0.504983
Transaction           0.002133
Furnishing            0.001600
facing                0.004266
overlooking           0.010132
Society               5.532952
Bathroom              0.005332
Balcony               0.005332
Car Parking           0.122113
Ownership             0.002133
Dimensions            0.000000
Plot Area             0.000000
Amount(Lac)           0.831329
Price (Lac)           5.843301
Carpet_Area_sqft      1.293120
Super_Area_sqft       1.396569
dtype: float64
[69]:
Index                 0.000000
Title                 0.000000
Description           1.612000
location              0.000000
Status                0.327946
Floor                 3.773776
Transaction           0.044259
Furnishing            1.544811
facing               37.451408
overlooking          43.425354
Society              58.485264
Bathroom              0.000000
Balcony               0.000000
Car Parking          55.114621
Ownership            34.936624
Dimensions          100.000000
Plot Area           100.000000
Amount(Lac)           5.163946
Price (Lac)           9.419776
Carpet_Area_sqft      0.792936
Super_Area_sqft       1.062758
dtype: float64
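The three summaries above look like unique-value counts, unique values as a percentage of rows, and missing values as a percentage of rows. For example, 81 locations over 187531 rows is about 0.0432 %, and 3023 missing descriptions over 187531 rows is about 1.612 %. The cells that produced them are not shown, so the following is only a sketch of how such summaries can be computed, on a small made-up frame named `df_demo`.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame; the column names are borrowed from the notebook.
df_demo = pd.DataFrame({
    "location": ["thane", "pune", "thane", "chennai"],
    "facing": ["East", np.nan, np.nan, "North"],
})

unique_counts = df_demo.nunique()                # distinct non-null values per column
unique_pct = unique_counts / len(df_demo) * 100  # distinct values as % of rows
null_pct = df_demo.isnull().mean() * 100         # missing values as % of rows

print(unique_counts, unique_pct, null_pct, sep="\n\n")
```

Columns with a very high missing percentage (such as `Dimensions` and `Plot Area` at 100 %) or with an almost unique value per row are natural candidates for dropping.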
[72]:
0
[80]:
| | Title | Description | location | Status | Floor | Transaction | Furnishing | facing | overlooking | Ownership |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 187531 | 184508 | 187531 | 186916 | 180454 | 187448 | 184634 | 117298 | 106095 | 122014 |
| unique | 32446 | 65634 | 81 | 1 | 947 | 4 | 3 | 8 | 19 | 4 |
| top | 2 BHK Ready to Occupy Flat for sale in Divyasr... | Multistorey apartment is available for sale. I... | new-delhi | Ready to Move | 2 out of 4 | Resale | Semi-Furnished | East | Main Road | Freehold |
| freq | 2106 | 2732 | 27599 | 186916 | 12433 | 144172 | 88318 | 54741 | 32193 | 112229 |
[81]:
| | Bathroom | Balcony | Amount(Lac) | Price (Lac) | Carpet_Area_sqft | Super_Area_sqft |
|---|---|---|---|---|---|---|
| count | 5.000000 | 5.000000 | 5.000000 | 4.000000 | 5.000000 | 5.0 |
| mean | 1.600000 | 1.600000 | 93.000000 | 0.140308 | 583.400000 | 680.0 |
| std | 0.547723 | 0.547723 | 59.050826 | 0.057607 | 125.416506 | 0.0 |
| min | 1.000000 | 1.000000 | 25.000000 | 0.060000 | 473.000000 | 680.0 |
| 25% | 1.000000 | 1.000000 | 42.000000 | 0.118493 | 500.000000 | 680.0 |
| 50% | 2.000000 | 2.000000 | 98.000000 | 0.156495 | 530.000000 | 680.0 |
| 75% | 2.000000 | 2.000000 | 140.000000 | 0.178310 | 635.000000 | 680.0 |
| max | 2.000000 | 2.000000 | 160.000000 | 0.188240 | 779.000000 | 680.0 |
[83]:
array(['Unfurnished', 'Semi-Furnished', 'Furnished', nan], dtype=object)
[85]:
| | Furnishing | variable | value |
|---|---|---|---|
| 0 | Unfurnished | Amount(Lac) | 42.0 |
| 1 | Semi-Furnished | Amount(Lac) | 98.0 |
| 2 | Unfurnished | Amount(Lac) | 140.0 |
| 3 | Unfurnished | Amount(Lac) | 25.0 |
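The `Furnishing | variable | value` layout above is the long format that `pandas.melt` produces. A minimal sketch, assuming the table was built by melting `Amount(Lac)` with `Furnishing` as the identifier column (the original cell is not shown), using the four rows visible in the output:

```python
import pandas as pd

# The four rows visible in the output above
df_demo = pd.DataFrame({
    "Furnishing": ["Unfurnished", "Semi-Furnished", "Unfurnished", "Unfurnished"],
    "Amount(Lac)": [42.0, 98.0, 140.0, 25.0],
})

# Long format: one row per (Furnishing, variable) pair
melted = pd.melt(df_demo, id_vars="Furnishing", value_vars=["Amount(Lac)"])

# Example follow-up: average amount per furnishing level
mean_by_furnishing = melted.groupby("Furnishing")["value"].mean()
print(melted)
print(mean_by_furnishing)
```

The long format is convenient for plotting libraries that expect one categorical column and one value column.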
[90]:
(1442, 16)
[92]:
178086
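The outline states a sanity rule: the number of balconies should not be more than twice the number of bathrooms. The cell that applies it is not shown, so the following is only a sketch of such a filter on a made-up frame; `df_demo`, `plausible` and `df_kept` are hypothetical names.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Bathroom": [1, 2, 3, 1],
    "Balcony": [2, 5, 1, 3],
})

# Keep rows where Balcony <= 2 * Bathroom; the rest are treated as implausible
plausible = df_demo["Balcony"] <= 2 * df_demo["Bathroom"]
df_kept = df_demo[plausible]

print(f"dropped {int((~plausible).sum())} of {len(df_demo)} rows")
```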
[94]:
df["Z_score"] = zscore(df["Bathroom"]) #zscore for number of bathrooms
[95]:
df["Z_score"].head() #zscore values
[95]:
1   -0.571879
2   -0.571879
3   -1.752920
4   -0.571879
5   -1.752920
Name: Z_score, dtype: float64
[96]:
df["Z_score"].min(),df["Z_score"].max()
[96]:
(-1.7529200054232745, 8.876452980042275)
##### Keeping rows within 3 standard deviations of the mean (-3 < z <= 3)
[98]:
df = df[(df.Z_score > -3) & (df.Z_score <= 3)]
[99]:
df.shape
[99]:
(177766, 17)
[100]:
# Drop the helper Z_score column now that the filtering is done.
# Reassigning avoids an in-place modification of a filtered slice.
df = df.drop(columns="Z_score")
[101]:
df.sample() #Random sample of the data
[101]:
| | Title | Description | location | Status | Floor | Transaction | Furnishing | facing | overlooking | Bathroom | Balcony | Ownership | Amount(Lac) | Price (Lac) | Carpet_Area_sqft | Super_Area_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 145389 | 2 BHK Ready to Occupy Flat for sale in Aranya ... | This magnificent 2 BHK Flat is available for s... | pune | Ready to Move | 3 out of 6 | New Property | Semi-Furnished | East | Garden/Park, Main Road | 2 | 2 | Co-operative Society | 84.0 | 0.07778 | 820.0 | 1100.0 |
### Using IQR for outlier removal, for Price and Amount
[106]:
(-0.029, 0.164)
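The pair above reads like a lower and an upper fence for an IQR filter. The cell itself is not shown, so the following is a minimal sketch of the usual IQR rule on made-up prices. The multiplier `k=1.5` is the conventional default and an assumption here, as is the helper name `iqr_bounds`.

```python
import pandas as pd

def iqr_bounds(series, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up prices in lakhs per square foot, with one obvious outlier
price = pd.Series([0.04, 0.05, 0.06, 0.07, 0.08, 0.95])
lower, upper = iqr_bounds(price)
price_kept = price[price.between(lower, upper)]

print(round(lower, 4), round(upper, 4), len(price_kept))
```

A negative lower fence, as in the notebook output, simply means that no low-side value can be flagged for a strictly positive column.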
[111]:
(1.0, 1219.906500585935, 709222.0)
[117]:
(1.0, 1397.6167287370727, 10890.0)
[123]:
Index(['Title', 'Description', 'location', 'Status', 'Floor', 'Transaction',
'Furnishing', 'facing', 'overlooking', 'Bathroom', 'Balcony',
'Ownership', 'Amount(Lac)', 'Price (Lac)', 'Carpet_Area_sqft',
'Super_Area_sqft'],
dtype='object')
[144]:
Index(['Bathroom', 'Balcony', 'Amount(Lac)', 'Price (Lac)', 'Carpet_Area_sqft',
'Super_Area_sqft'],
dtype='object')
[153]:
(132526, 16)
[155]:
(500, 16)
[156]:
Title                0.0
Description          0.4
location             0.0
Status               0.6
Floor                0.6
Transaction          0.0
Furnishing           1.2
facing              35.4
overlooking         37.8
Bathroom             0.0
Balcony              0.0
Ownership           33.8
Amount(Lac)          0.0
Price (Lac)          0.0
Carpet_Area_sqft     0.0
Super_Area_sqft      0.0
dtype: float64
[159]:
| | Title | Description | location | Status | Floor | Transaction | Furnishing | facing | overlooking | Bathroom | Balcony | Ownership | Amount(Lac) | Price (Lac) | Carpet_Area_sqft | Super_Area_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 305 | 2 BHK Ready to Occupy Flat for sale Kalyan West | 2 BHK flat available for sale in Thane in the ... | thane | Ready to Move | 2 out of 8 | Resale | Unfurnished | North | Garden/Park, Main Road | 2 | 2 | Freehold | 62.0 | 0.062 | 640.0 | 540.0 |
[169]:
Index(['Title', 'Description', 'location', 'Status', 'Floor', 'Transaction',
'Furnishing', 'facing', 'overlooking', 'Bathroom', 'Balcony',
'Ownership', 'Amount(Lac)', 'Price (Lac)', 'Carpet_Area_sqft',
'Super_Area_sqft'],
dtype='object')
[172]:
| | Title | Description | location | Status | Floor | Transaction | Furnishing | facing | overlooking | Bathroom | Balcony | Ownership | Price (Lac) | Carpet_Area_sqft | Super_Area_sqft |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 BHK Ready to Occupy Flat for sale in Dosti V... | One can find this stunning 2 BHK flat for sale... | thane | Ready to Move | 3 out of 22 | Resale | Semi-Furnished | East | Garden/Park | 2 | 2 | Freehold | 0.13799 | 473.0 | 680.0 |
| 1 | 1 BHK Ready to Occupy Flat for sale in Virat A... | Creatively planned and constructed is a 1 BHK ... | thane | Ready to Move | 2 out of 7 | Resale | Unfurnished | East | Garden/Park, Main Road | 1 | 1 | Co-operative Society | 0.06618 | 635.0 | 680.0 |
| 5 | 3 BHK Ready to Occupy Flat for sale in Pride P... | One can find this stunning 3 BHK flat for sale... | thane | Ready to Move | 3 out of 27 | Resale | Unfurnished | East | Garden/Park | 3 | 1 | Freehold | 0.11150 | 900.0 | 1165.0 |
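The feature table above no longer contains `Amount(Lac)`, which is consistent with it being the target, and the outline calls for a 70 % / 15 % / 15 % train, validation and test split. The splitting cell is not shown; one common way to get those proportions is two calls to `train_test_split`, sketched below on made-up data. The variable names and `random_state=42` are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data: 100 rows with one feature and a target
X_demo = pd.DataFrame({"Carpet_Area_sqft": np.arange(100, dtype=float)})
y_demo = pd.Series(np.arange(100, dtype=float), name="Amount(Lac)")

# First split: 70 % train, 30 % held out
X_train, X_rest, y_train, y_rest = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=42
)
# Second split: halve the held-out part into validation and test (15 % each)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42
)

print(len(X_train), len(X_val), len(X_test))
```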
[184]:
ColumnTransformer(transformers=[('num_pipe',
Pipeline(steps=[('impute', SimpleImputer()),
('scale', StandardScaler())]),
['Bathroom', 'Balcony', 'Carpet_Area_sqft',
'Price (Lac)', 'Super_Area_sqft']),
('cat_pipe',
Pipeline(steps=[('impute',
SimpleImputer(strategy='most_frequent')),
('one-hot-encoder',
OneHotEncoder(handle_unknown='ignore'))]),
['location', 'Transac...
Pipeline(steps=[('text_count',
CountVectorizer())]),
'Title'),
('text_pipe2',
Pipeline(steps=[('text_count',
CountVectorizer())]),
'Description'),
('text_pipe3',
Pipeline(steps=[('text_count',
CountVectorizer())]),
'Status'),
('text_pipe4',
Pipeline(steps=[('text_count',
CountVectorizer())]),
'Floor'),
('text_pipe5',
Pipeline(steps=[('text_count',
CountVectorizer())]),
'overlooking')])
Columns and steps per transformer:
num_pipe: ['Bathroom', 'Balcony', 'Carpet_Area_sqft', 'Price (Lac)', 'Super_Area_sqft'] -> SimpleImputer() -> StandardScaler()
cat_pipe: ['location', 'Transaction', 'Furnishing', 'facing', 'Ownership'] -> SimpleImputer(strategy='most_frequent') -> OneHotEncoder(handle_unknown='ignore')
text pipes (one per column): Title, Description, Status, Floor, overlooking -> CountVectorizer()
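The printed representation above gives enough detail to sketch how such a `ColumnTransformer` can be assembled. This is a reconstruction under assumptions, not the notebook's cell: the text-pipe names (`text_Title` and so on) are hypothetical because the first one is truncated in the output, and the three demo rows are shortened, made-up variants of the listings shown earlier.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

num_cols = ["Bathroom", "Balcony", "Carpet_Area_sqft", "Price (Lac)", "Super_Area_sqft"]
cat_cols = ["location", "Transaction", "Furnishing", "facing", "Ownership"]
text_cols = ["Title", "Description", "Status", "Floor", "overlooking"]

num_pipe = Pipeline([("impute", SimpleImputer()), ("scale", StandardScaler())])
cat_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("one-hot-encoder", OneHotEncoder(handle_unknown="ignore")),
])

transformers = [("num_pipe", num_pipe, num_cols), ("cat_pipe", cat_pipe, cat_cols)]
for col in text_cols:
    # CountVectorizer needs a 1-D column, so the selector is a string, not a list.
    # The step name "text_<col>" is hypothetical.
    text_pipe = Pipeline([("text_count", CountVectorizer())])
    transformers.append((f"text_{col}", text_pipe, col))

transformer = ColumnTransformer(transformers=transformers)

# Three made-up rows, loosely modelled on the listings shown earlier
demo = pd.DataFrame({
    "Bathroom": [2, 1, 3],
    "Balcony": [2, 1, 1],
    "Carpet_Area_sqft": [473.0, 635.0, 900.0],
    "Price (Lac)": [0.13799, 0.06618, 0.11150],
    "Super_Area_sqft": [680.0, 680.0, 1165.0],
    "location": ["thane", "thane", "thane"],
    "Transaction": ["Resale", "Resale", "Resale"],
    "Furnishing": ["Semi-Furnished", "Unfurnished", "Unfurnished"],
    "facing": ["East", "East", "East"],
    "Ownership": ["Freehold", "Co-operative Society", "Freehold"],
    "Title": ["2 BHK flat for sale", "1 BHK flat for sale", "3 BHK flat for sale"],
    "Description": ["stunning flat", "creatively planned flat", "stunning flat"],
    "Status": ["Ready to Move", "Ready to Move", "Ready to Move"],
    "Floor": ["3 out of 22", "2 out of 7", "3 out of 27"],
    "overlooking": ["Garden/Park", "Garden/Park, Main Road", "Garden/Park"],
})
features = transformer.fit_transform(demo)
print(features.shape)
```

`CountVectorizer` expects strings, so text columns that still contain missing values (for example `overlooking`) would need to be filled before fitting.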
[190]:
| | | best_performance | best_params |
|---|---|---|---|
| 0 | svr | 0.870398 | {'C': 1, 'degree': 3, 'gamma': 'scale', 'kerne... |
| 1 | GBR | 0.925321 | {'learning_rate': 0.5, 'n_estimators': 10} |
| 2 | RFR | 0.911580 | {'n_estimators': 100} |
| 3 | DTR | 0.810205 | {'splitter': 'best'} |
| 4 | KNN | 0.577278 | {'n_neighbors': 5} |
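A table of best score and best parameters per model, like the one above, is what a grid search per estimator typically yields. The tuning cell is not shown, so the following is a sketch that assumes `GridSearchCV` with R² scoring on synthetic data. The grids, the `model` column name and the data are all illustrative, and only three of the five models are included to keep it short.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the transformed housing features
X_demo, y_demo = make_regression(n_samples=120, n_features=5, noise=5.0, random_state=0)

# Illustrative estimators and grids (not the notebook's actual grids)
candidates = {
    "GBR": (GradientBoostingRegressor(random_state=0),
            {"learning_rate": [0.1, 0.5], "n_estimators": [10, 50]}),
    "DTR": (DecisionTreeRegressor(random_state=0), {"splitter": ["best", "random"]}),
    "KNN": (KNeighborsRegressor(), {"n_neighbors": [3, 5]}),
}

rows = []
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=3, scoring="r2")
    search.fit(X_demo, y_demo)
    rows.append({
        "model": name,
        "best_performance": search.best_score_,  # mean cross-validated R²
        "best_params": search.best_params_,
    })

results = pd.DataFrame(rows)
print(results)
```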
[194]:
VotingRegressor(estimators=[('svr', SVR(C=100, gamma='auto', kernel='linear')),
('gboost', GradientBoostingRegressor()),
('randomForest',
RandomForestRegressor(n_estimators=120)),
('DTR', DecisionTreeRegressor()),
('KNN', KNeighborsRegressor())])
[197]:
Pipeline(steps=[('columntransformer',
ColumnTransformer(transformers=[('num_pipe',
Pipeline(steps=[('impute',
SimpleImputer()),
('scale',
StandardScaler())]),
['Bathroom', 'Balcony',
'Carpet_Area_sqft',
'Price (Lac)',
'Super_Area_sqft']),
('cat_pipe',
Pipeline(steps=[('impute',
SimpleImputer(strategy='most_frequent')),
('one-hot-encoder',
OneHotEncoder(handle_unkn...
CountVectorizer())]),
'Floor'),
('text_pipe5',
Pipeline(steps=[('text_count',
CountVectorizer())]),
'overlooking')])),
('votingregressor',
VotingRegressor(estimators=[('svr',
SVR(C=100, gamma='auto',
kernel='linear')),
('gboost',
GradientBoostingRegressor()),
('randomForest',
RandomForestRegressor(n_estimators=120)),
('DTR', DecisionTreeRegressor()),
('KNN', KNeighborsRegressor())]))])
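The step names `columntransformer` and `votingregressor` in the output above are the lowercase class names that `make_pipeline` assigns automatically. A minimal sketch of linking a preprocessing step to the voting ensemble follows. The estimator settings are copied from the printed representation, while `StandardScaler` and the synthetic data stand in for the notebook's `ColumnTransformer` and housing data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Estimators and settings as printed in the notebook output
voting = VotingRegressor(estimators=[
    ("svr", SVR(C=100, gamma="auto", kernel="linear")),
    ("gboost", GradientBoostingRegressor()),
    ("randomForest", RandomForestRegressor(n_estimators=120)),
    ("DTR", DecisionTreeRegressor()),
    ("KNN", KNeighborsRegressor()),
])

# StandardScaler is a stand-in for the notebook's ColumnTransformer
model = make_pipeline(StandardScaler(), voting)

# Synthetic data standing in for the housing features and target
X_demo, y_demo = make_regression(n_samples=80, n_features=4, noise=1.0, random_state=0)
model.fit(X_demo, y_demo)

step_names = list(model.named_steps)  # auto-generated, lowercase class names
predictions = model.predict(X_demo[:5])
print(step_names, predictions.shape)
```

Putting preprocessing and the ensemble in one pipeline means the fitted transformer is reused unchanged at prediction time, which keeps validation and test data out of the fitting step.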
[198]:
0.9248792176433047
[203]:
11.7
[206]:
['house_prediction.pkl']
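The list above is the return value of `joblib.dump`, which returns the file names it wrote. A minimal sketch of saving and reloading a fitted model is shown below. A small `LinearRegression` and a temporary directory are used as stand-ins so the example does not overwrite anything.

```python
import os
import tempfile

import joblib
from sklearn.linear_model import LinearRegression

# Stand-in model; the notebook saves its fitted pipeline instead
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 2.0, 4.0])

# Temporary directory so the example leaves the working directory untouched
path = os.path.join(tempfile.mkdtemp(), "house_prediction.pkl")
saved_files = joblib.dump(model, path)  # returns the list of written file names

restored = joblib.load(path)
print(saved_files, restored.predict([[3.0]]))
```

A pickled pipeline should be loaded with the same scikit-learn version that created it, and only from sources you trust.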
[209]:
[49.49247040602355, 46.325064720749445, 135.9023631614847, 115.13642342286353, 155.05089004532172, 148.1655839975253, 32.834095364328164, 124.36511853550692, 77.5755947517719, 102.9033532880168]
[211]:
[42.0, 19.0, 165.0, 109.0, 142.0, 130.0, 23.8, 130.0, 70.0, 109.0]
[212]:
0.9273919740050525
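The two ten-element lists above look like predicted amounts followed by the matching actual amounts in lakhs. Assuming that reading, a quick per-sample check can be sketched as follows; the predictions are rounded to two decimals for brevity.

```python
import numpy as np

# Values copied from the two lists above (predictions rounded to 2 decimals).
# Treating the first list as predictions and the second as actuals is an assumption.
predicted = np.array([49.49, 46.33, 135.90, 115.14, 155.05,
                      148.17, 32.83, 124.37, 77.58, 102.90])
actual = np.array([42.0, 19.0, 165.0, 109.0, 142.0,
                   130.0, 23.8, 130.0, 70.0, 109.0])

abs_errors = np.abs(predicted - actual)
mae = abs_errors.mean()            # mean absolute error over the ten samples
worst = int(abs_errors.argmax())   # position of the largest miss

print(f"MAE on these ten samples: {mae:.2f} Lac; largest miss at position {worst}")
```

Ten samples are too few to judge the model; the check is mainly useful for spotting individual listings with large errors.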
